We introduce an architecture for processing signals supported on hypergraphs via graph neural networks (GNNs), which we call a Hypergraph Expansion Neural Network (HENN), and provide the first bounds on the stability and transferability error of a hypergraph signal processing model. To do so, we develop a framework for bounding the stability and transferability error of GNNs across arbitrary graphs via spectral similarity. By bounding the difference between two graph shift operators (GSOs) in the positive semi-definite sense via their eigenvalue spectra, we show that this error depends only on the properties of the GNN and the degree of spectral similarity between the GSOs. Moreover, we show that existing transferability results, which assume the graphs are small perturbations of one another, or that the graphs are random and drawn from the same distribution or sampled from the same graphon, can be recovered using our approach. Thus, both GNNs and our HENNs (trained using normalized Laplacians as graph shift operators) become increasingly stable and transferable as the graphs grow larger. Experimental results illustrate the importance of considering multiple graph representations in HENN and show its superior performance when transferability is desired.
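For illustration, one common way to formalize spectral similarity between two graph shift operators is the two-sided positive semi-definite bound below; this is a standard textbook form, not necessarily the paper's exact definition or constants.

```latex
% Illustrative only: a standard (1 +/- epsilon) spectral-similarity bound
% between two graph shift operators S_1 and S_2 (e.g., normalized Laplacians).
% The paper's precise definition may differ.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Two GSOs $S_1, S_2 \in \mathbb{R}^{n \times n}$ are $\epsilon$-spectrally similar if
\[
  (1-\epsilon)\, x^\top S_2\, x \;\le\; x^\top S_1\, x \;\le\; (1+\epsilon)\, x^\top S_2\, x
  \qquad \text{for all } x \in \mathbb{R}^n ,
\]
written compactly in the positive semi-definite order as
\[
  (1-\epsilon)\, S_2 \;\preceq\; S_1 \;\preceq\; (1+\epsilon)\, S_2 .
\]
Under a bound of this form, the output perturbation of a graph filter or GNN
can be controlled in terms of $\epsilon$ and properties of the filter alone.
\end{document}
```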
Multiparameter persistent homology has been largely overlooked as an input to machine learning algorithms. We consider the use of lattice-based convolutional neural network layers as a tool for analyzing features arising from multiparameter persistence modules. We find that these show promise as an alternative to convolutions for the classification of multidimensional persistence modules.
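A minimal sketch under simplifying assumptions: a 2-parameter persistence module is assumed to have already been discretized onto a regular 32x32 grid (e.g., via its Hilbert function or a multiparameter persistence image), and a plain 2D CNN classifies it. This only illustrates the conventional convolutional baseline against which lattice-based layers would be compared, not the lattice-based layer itself.

```python
# Sketch: classify a 2-parameter persistence module discretized on a grid
# with an ordinary 2D CNN. The grid construction and the lattice-based layer
# from the abstract are NOT implemented; the input is assumed to be the
# module's Hilbert function sampled on a 32x32 grid (a simplifying assumption).
import torch
import torch.nn as nn

class GridCNNClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),               # global average pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):                          # x: (batch, 1, 32, 32)
        return self.classifier(self.features(x).flatten(1))

# Usage with a random placeholder "persistence grid":
model = GridCNNClassifier(num_classes=3)
grid = torch.rand(8, 1, 32, 32)                    # 8 discretized modules
logits = model(grid)                               # shape (8, 3)
```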
In recent years, reinforcement learning (RL) has become increasingly successful in its application to science and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent's learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery proceeds in three stages: first, an RL agent generates data; then, a mining algorithm extracts gadgets; and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states, where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent- and environment-agnostic and can yield interesting insights into any agent's policy.
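A minimal sketch of the three-stage pipeline under simplifying assumptions: episodes are plain lists of discrete action labels, "gadgets" are frequent contiguous subsequences (n-grams), and gadgets are clustered by two simple numeric features (length and support) with DBSCAN. The paper's actual mining algorithm and grouping metrics may differ.

```python
# Illustrative gadget-discovery pipeline: (1) episodes from a trained agent,
# (2) frequent-subsequence mining, (3) density-based clustering of gadgets.
from collections import Counter
import numpy as np
from sklearn.cluster import DBSCAN

def mine_gadgets(episodes, min_len=2, max_len=4, min_support=2):
    """Stage 2: count contiguous subsequences across episodes and keep
    those appearing at least `min_support` times."""
    counts = Counter()
    for ep in episodes:
        for n in range(min_len, max_len + 1):
            for i in range(len(ep) - n + 1):
                counts[tuple(ep[i:i + n])] += 1
    return {g: c for g, c in counts.items() if c >= min_support}

def cluster_gadgets(gadgets, eps=1.0, min_samples=2):
    """Stage 3: group gadgets by (length, support) with density-based clustering."""
    keys = list(gadgets)
    feats = np.array([[len(g), gadgets[g]] for g in keys], dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)
    return dict(zip(keys, labels))

# Stage 1 (placeholder): action sequences a trained RL agent might produce.
episodes = [["H", "CX", "H", "M"], ["H", "CX", "H", "M", "X"], ["CX", "H", "M"]]
gadgets = mine_gadgets(episodes)
print(cluster_gadgets(gadgets))
```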
The evolution of wireless communications into 6G and beyond is expected to rely on new machine learning (ML)-based capabilities. These can enable proactive decisions and actions from wireless-network components to sustain quality of service (QoS) and user experience. Moreover, new use cases in the area of vehicular and industrial communications will emerge. Specifically, in vehicular communication, vehicle-to-everything (V2X) schemes will benefit strongly from such advances. With this in mind, we have conducted a detailed measurement campaign with the purpose of enabling a plethora of diverse ML-based studies. The resulting datasets offer GPS-located wireless measurements across diverse urban environments for both cellular (with two different operators) and sidelink radio access technologies, thus enabling a variety of V2X-oriented studies. The datasets are labeled and sampled with a high time resolution. Furthermore, we make the data publicly available with all the necessary information to support the onboarding of new researchers. We provide an initial analysis of the data, showing some of the challenges that ML needs to overcome and the features that ML can leverage, as well as some hints at potential research studies.
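A minimal sketch of how such GPS-located, time-stamped measurements might be explored once loaded into a DataFrame. The column names (timestamp, lat, lon, operator, rsrp) and values are hypothetical placeholders, not the published dataset's actual schema; a real analysis would start from the released files.

```python
# Toy exploration of GPS-located, high-time-resolution wireless measurements:
# per-operator statistics and resampling to 1-second averages.
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.date_range("2022-01-01 12:00:00", periods=6, freq="250ms"),
    "lat": [50.110, 50.110, 50.111, 50.111, 50.112, 50.112],
    "lon": [8.682, 8.682, 8.683, 8.683, 8.684, 8.684],
    "operator": ["A", "B", "A", "B", "A", "B"],
    "rsrp": [-95.0, -101.0, -94.5, -100.0, -96.0, -102.5],   # dBm, placeholder
})

# Per-operator signal statistics as simple features for ML studies.
print(df.groupby("operator")["rsrp"].describe())

# Aggregate the high-time-resolution trace to 1-second averages per operator.
per_second = (
    df.set_index("timestamp")
      .groupby("operator")["rsrp"]
      .resample("1s")
      .mean()
)
print(per_second)
```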
Our goal is to reconstruct tomographic images from few measurements and at a low signal-to-noise ratio. In clinical imaging, this helps to improve patient comfort and reduce radiation exposure. As quantum computing advances, we propose to use an adiabatic quantum computer and associated hybrid methods to solve the reconstruction problem. Tomographic reconstruction is an ill-posed inverse problem. We test our reconstruction technique with respect to image size, noise content, and underdetermination of the measured projection data. We then present reconstructed binary and integer-valued images of up to 32 by 32 pixels. The demonstrated method competes with traditional reconstruction algorithms and is superior in terms of robustness to noise and reconstruction from few projections. We postulate that hybrid quantum computing will soon reach maturity for real applications in tomographic reconstruction. Finally, we point out the current limitations regarding the problem size and interpretability of the algorithm.
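A minimal sketch of one way a binary reconstruction problem can be cast as a QUBO suitable for an annealer: minimize ||Ax - b||^2 over binary x, folding the linear term into the diagonal of Q. The system matrix below is a random placeholder, and a brute-force search stands in for the (hybrid) annealer; the paper's exact formulation, regularization, and solver are not reproduced.

```python
# Sketch: binary tomographic reconstruction A x ~= b as a QUBO,
# min_x ||A x - b||^2 with x in {0, 1}^n. Since x_i^2 = x_i for binary
# variables, the linear term -2 (A^T b)^T x goes onto the diagonal of Q.
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_meas = 16, 24                       # tiny 4x4 image, toy sizes
A = rng.integers(0, 2, size=(n_meas, n_pixels)).astype(float)  # placeholder geometry
x_true = rng.integers(0, 2, size=n_pixels).astype(float)
b = A @ x_true                                  # noiseless toy measurements

# ||A x - b||^2 = x^T (A^T A) x - 2 (A^T b)^T x + const
Q = A.T @ A
Q[np.diag_indices_from(Q)] -= 2.0 * (A.T @ b)

def bits(k: int) -> np.ndarray:
    """Binary vector for candidate index k."""
    return np.array([(k >> i) & 1 for i in range(n_pixels)], dtype=float)

# Exhaustive search over all 2^16 candidates stands in for the annealer here;
# on realistic problem sizes this is exactly where a (hybrid) annealer is used.
best = min(range(2 ** n_pixels), key=lambda k: bits(k) @ Q @ bits(k))
x_hat = bits(best)
print("residual ||A x_hat - b|| =", np.linalg.norm(A @ x_hat - b))
```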
Optimization equips engineers and scientists in a variety of fields with the ability to transcribe their problems into a generic formulation and receive optimal solutions with relative ease. Industries ranging from aerospace to robotics continue to benefit from advancements in optimization theory and the associated algorithmic developments. Nowadays, optimization is used in real time on autonomous systems acting in safety-critical situations, such as self-driving vehicles. It has become increasingly important to produce robust solutions by incorporating uncertainty into optimization programs. This paper provides a short survey of the state of the art in optimization under uncertainty. The paper begins with a brief overview of the main classes of optimization without uncertainty. The rest of the paper focuses on the different methods for handling both aleatoric and epistemic uncertainty. Many of the applications discussed in this paper are within the domain of control. The goal of this survey is to briefly touch upon the state of the art in a variety of methods and refer the reader to other literature for more in-depth treatments of the topics discussed here.
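As a small illustrative aside (standard textbook formulations, not taken from the survey itself), the sketch below contrasts a nominal program with a robust counterpart for set-based (epistemic) uncertainty and a chance-constrained variant for probabilistic (aleatoric) uncertainty.

```latex
% Illustrative textbook forms only, not excerpted from the survey.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Nominal problem, with $\xi$ collecting the uncertain parameters:
\[
  \min_{x \in \mathcal{X}} \; f(x) \quad \text{s.t.} \quad g(x,\xi) \le 0 .
\]
Robust counterpart (the constraint must hold for every realization in an
uncertainty set $\Xi$):
\[
  \min_{x \in \mathcal{X}} \; f(x) \quad \text{s.t.} \quad g(x,\xi) \le 0
  \;\; \forall \xi \in \Xi .
\]
Chance-constrained formulation (the constraint must hold with probability at
least $1-\varepsilon$ under the distribution of $\xi$):
\[
  \min_{x \in \mathcal{X}} \; f(x) \quad \text{s.t.} \quad
  \mathbb{P}\bigl[\, g(x,\xi) \le 0 \,\bigr] \ge 1 - \varepsilon .
\]
\end{document}
```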
Unhealthy dietary habits are considered the primary cause of multiple chronic diseases such as obesity and diabetes. Automatic food intake monitoring systems have the potential to improve the quality of life (QoL) of people with diet-related diseases through dietary assessment. In this work, we propose a novel contactless, radar-based food intake monitoring approach. Specifically, a Frequency Modulated Continuous Wave (FMCW) radar sensor is employed to recognize fine-grained eating and drinking gestures. A fine-grained eating/drinking gesture comprises the entire movement from raising the hand to the mouth until moving the hand away from the mouth. A 3D temporal convolutional network (3D-TCN) is developed to detect and segment eating and drinking gestures in meal sessions by processing the Range-Doppler Cube (RD Cube). Unlike previous radar-based research, this work collects data in continuous meal sessions. We create a public dataset that contains 48 meal sessions (3121 eating gestures and 608 drinking gestures) from 48 participants with a total duration of 783 minutes. Four eating styles (fork & knife, chopsticks, spoon, hand) are included in this dataset. To validate the performance of the proposed approach, 8-fold cross-validation is applied. Experimental results show that our proposed 3D-TCN outperforms both a combined convolutional neural network and long short-term memory model (CNN-LSTM) and a CNN with bidirectional LSTM (CNN-BiLSTM) in eating and drinking gesture detection. The 3D-TCN model achieves segmental F1-scores of 0.887 and 0.844 for eating and drinking gestures, respectively. These results indicate the feasibility of using radar for fine-grained eating and drinking gesture detection and segmentation in meal sessions.
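A minimal sketch, assuming the Range-Doppler Cube is a tensor of shape (batch, 1, time, range, Doppler) and that a small stack of 3D convolutions with a per-frame classification head suffices to convey the idea; the paper's 3D-TCN architecture (its dilations, depth, and segmentation post-processing) is not reproduced here.

```python
# Illustrative 3D convolutional network over Range-Doppler Cubes producing a
# per-frame label (e.g., background / eating / drinking). Shapes, channel
# widths, and the head are assumptions, not the paper's 3D-TCN.
import torch
import torch.nn as nn

class RDCubeNet(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.backbone = nn.Sequential(
            # Convolve jointly over (time, range, Doppler); keep the time length.
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),          # downsample range/Doppler only
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),           # collapse range/Doppler, keep time
        )
        self.frame_head = nn.Conv1d(32, num_classes, kernel_size=1)

    def forward(self, x):                                 # x: (B, 1, T, R, D)
        feats = self.backbone(x).squeeze(-1).squeeze(-1)  # (B, 32, T)
        return self.frame_head(feats)                     # (B, num_classes, T)

rd_cube = torch.rand(2, 1, 64, 32, 32)                    # 2 clips, 64 radar frames
print(RDCubeNet()(rd_cube).shape)                         # torch.Size([2, 3, 64])
```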
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible, and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations, and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical, and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.
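A minimal sketch of a typical MONAI workflow: imaging transforms, a dataset, a purpose-built 3D UNet, and a Dice loss. Argument names can vary slightly between MONAI versions, and the file path is a placeholder; this is illustrative, not an excerpt from the MONAI documentation.

```python
# Requires `pip install monai`. Builds a transform chain, a dataset/loader,
# a 3D UNet, and a Dice loss; no training loop is shown.
from monai.transforms import Compose, LoadImage, EnsureChannelFirst, ScaleIntensity, Resize
from monai.data import Dataset, DataLoader
from monai.networks.nets import UNet
from monai.losses import DiceLoss

transforms = Compose([
    LoadImage(image_only=True),      # read NIfTI/DICOM into a tensor
    EnsureChannelFirst(),            # channel-first layout
    ScaleIntensity(),                # intensity normalization
    Resize((96, 96, 96)),            # common spatial size
])

images = ["ct_volume_001.nii.gz"]    # placeholder file path
loader = DataLoader(Dataset(data=images, transform=transforms), batch_size=1)

model = UNet(
    spatial_dims=3, in_channels=1, out_channels=2,
    channels=(16, 32, 64, 128), strides=(2, 2, 2),
)
loss_fn = DiceLoss(to_onehot_y=True, softmax=True)
```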
Insufficient training samples are a common problem in neural network applications. While data augmentation methods require at least a minimum number of samples, we propose a novel, rendering-based pipeline for synthesizing annotated datasets. Rather than modifying existing samples, our approach synthesizes entirely new ones. The proposed rendering-based pipeline is capable of generating and annotating synthetic and partially real image and video data in a fully automated procedure. Moreover, the pipeline can aid the acquisition of real data. The proposed pipeline is based on a rendering process that generates the synthetic data. The partially real data brings the synthetic sequences closer to reality by incorporating real cameras during the acquisition process. An extensive experimental validation in the context of automatic license plate recognition demonstrates the benefit of the proposed data generation pipeline, especially for machine learning scenarios with limited available training data. Compared to an OCR algorithm trained exclusively on real data, the experiments show that the character error rate and miss rate are reduced from 73.74% and 100% to 14.11% and 41.27%, respectively. These improvements are achieved by training the algorithm on synthetic data only. When real data is additionally incorporated, the error rates can be reduced further: the character error rate and miss rate then drop to 11.90% and 39.88%, respectively. All data used during the experiments, as well as the proposed rendering-based pipeline for automatic data generation, are publicly available (the URL will be revealed upon publication).
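A minimal sketch of how the reported character error rate can be computed: Levenshtein edit distance between the recognized string and the ground-truth plate, normalized by the ground-truth length. The paper's exact miss-rate definition is not reproduced here.

```python
# Character error rate (CER) = edit distance / reference length.
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(prediction: str, reference: str) -> float:
    return levenshtein(prediction, reference) / max(len(reference), 1)

print(cer("B AB 1234", "B AB 1284"))   # one substitution over 9 characters
```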
In this paper, we propose a data augmentation framework for optical character recognition (OCR). The proposed framework is able to synthesize new viewing angles and illumination scenarios, effectively enriching any available OCR dataset. Its modular structure allows it to be modified to match individual user requirements. The framework also makes it possible to comfortably scale the augmentation factor of an available dataset. Furthermore, the proposed method is not limited to single-frame OCR but can be applied to video OCR as well. We demonstrate the performance of the framework by augmenting a 15% subset of the common Brno Mobile OCR dataset. Our proposed framework is capable of boosting the performance of OCR applications, especially for small datasets. Applying the proposed method, improvements of up to 2.79 percentage points in terms of the character error rate (CER) are achieved, and gains of up to 7.88 percentage points are obtained on the subset. In particular, the recognition of challenging text lines can be improved: for this category, the CER can be reduced by up to 14.92 percentage points, down to 18.19%. Moreover, when training on the 15% subset augmented with the proposed method, we are able to reach a lower error rate than with the original, non-augmented full dataset.
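A minimal sketch of the two kinds of augmentation mentioned above, assuming OpenCV as the backend: a new viewing angle via a random perspective warp and a new illumination scenario via a simple brightness/contrast change. The paper's rendering-based framework is considerably more sophisticated than this.

```python
# Toy viewpoint and illumination augmentations for an OCR text-line image.
import cv2
import numpy as np

def random_viewpoint(img: np.ndarray, max_shift: float = 0.15) -> np.ndarray:
    """Warp the image as if seen from a slightly different camera position."""
    h, w = img.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    jitter = np.random.uniform(-max_shift, max_shift, (4, 2)) * [w, h]
    dst = np.float32(src + jitter)
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(img, M, (w, h))

def random_illumination(img: np.ndarray) -> np.ndarray:
    """Apply a random global contrast (alpha) and brightness (beta) change."""
    alpha = np.random.uniform(0.6, 1.4)
    beta = np.random.uniform(-40, 40)
    return cv2.convertScaleAbs(img, alpha=alpha, beta=beta)

# Placeholder text-line image; in practice this would be a real OCR sample.
sample = (np.random.rand(64, 256, 3) * 255).astype(np.uint8)
augmented = random_illumination(random_viewpoint(sample))
print(augmented.shape, augmented.dtype)
```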